Part-of-Speech is (almost) enough: SAP Research & Innovation at the #Microposts2014 NEEL Challenge

Authors

  • Daniel Dahlmeier
  • Naveen Nandan
  • Wang Ting
Abstract

This paper describes the submission of the SAP Research & Innovation team at the #Microposts2014 NEEL Challenge. We use a two-stage approach for named entity extraction and linking, based on conditional random fields and an ensemble of search APIs and rules, respectively. A surprising result of our work is that part-of-speech tags alone are almost sufficient for entity extraction. Our results for the combined extraction and linking task are 34.6% and 37.2% F1 on a development and test split of the training set, respectively, and 37% F1 on the test set.

Similar resources

Making Sense of Microposts (#Microposts2014) Named Entity Extraction & Linking Challenge

Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. They comprise a wealth of data which is increasing exponentially, and which therefore presents new challenges for the information extraction community, among others. This paper describes the ‘Making Sense of Microposts’ (#Microposts2014) Workshop’s Named Entity Extraction and Li...

Making sense of microposts: (#Microposts2014) named entity extraction & linking challenge (The Open University's repository of research publications and other research outputs)

Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. They comprise a wealth of data which is increasing exponentially, and which therefore presents new challenges for the information extraction community, among others. This paper describes the ‘Making Sense of Microposts’ (#Microposts2014) Workshop’s Named Entity Extraction and Li...

DataTXT at #Microposts2014 Challenge

In this paper we describe the approach taken for the “Making Sense of Microposts challenge 2014” (#Microposts2014), where participants were asked to cross reference micro-posts extracted from Twitter with DBpedia URIs belonging to a given taxonomy. For this task we deployed dataTXT which is the evolution of Tagme[3], the state-of-the-art topic annotator for short texts and which has proven to b...

Design and Implementation of an Intelligent Part of Speech Generator

The aim of this paper is to report on an attempt to design and implement an intelligent system capable of generating the correct part of speech for a given sentence while the sentence is totally new to the system and not stored in any database available to the system. It follows the same steps a normal individual does to provide the correct parts of speech using a natural language processor. It...

A Study of the Features and Functions of speech Perseverance (With an Emphasis on the Alavi Teachings)

The serious challenge that contemporary humans face has been brought about by a failure to apply ethical and behavioral necessities in life rather than by weak rules or a lack of technology. One of these important necessities is speech perseverance, which has a particular conceptual and meaningful weight that is the adducing of the right spe...

Publication date: 2014